Health outcomes and demographic characteristics

Many demographic characteristics (like age, household income, and race) and health outcomes (like physical health and specific diseases) are correlated, meaning they are related in predictable ways.

Some relationships (correlations) between demographic characteristics and health outcomes are unavoidable. For example, as people age, their risk of developing cancer increases. So as you might expect, communities with more residents who are 65 and older also tend to have more residents who have cancer than younger communities. You can see this clearly in the scatter plot of Charlottesville region census tracts below.

xyscatter <- plot_ly(data = cdat, x = cdat$age65E, y = cdat$Cancer_except_skin2018,
                     type = "scatter",
                     mode = "markers",
                     size = ~totalpopE, sizes = c(1, 500),
                     color = ~countyname, colors = "Dark2",
                     alpha = .75,
                     text = paste0("Locality: ", cdat$countyname, "<br>",
                                   "Census tract: ", cdat$tract, "<br>",
                                   "Population: ", cdat$totalpopE, "<br>",
                                   "% with Cancer: ", cdat$Cancer_except_skin2018, "<br>",
                                   "% 65+: ", cdat$age65E),
                     hoverinfo = "text") %>%
      layout(xaxis = list(title = "Percent 65 and older", showticklabels = TRUE),
             yaxis = list(title = "Percent with cancer (excluding skin cancer)", showticklabels = TRUE),
             legend = list(orientation = "h", x = 0, y = -0.2))

xyscatter
Click here for variable details


  • Population size (the size of each data point): Small-area population estimates are from the Census Bureau’s American Community Survey 5-Year Estimates. The survey is sent to approximately 3.5 million addresses per year, and the 5-year estimates provide up-to-date figures for localities that may be changing between censuses. Because these are estimates derived from surveys rather than a full census, they are subject to sampling error.

  • Percent 65 and older: The percent of population 65 or older estimates the proportion of adults most likely to be retirement age. Source: U.S. Census Bureau, American Community Survey 5-year estimates 2015-2019.

  • Percent with cancer: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they have any type of cancer except skin cancer. Not specific to cancer type. Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate. Source: CDC Places: Local Data for Better Health.


In the plot above, each census tract in the Charlottesville region is represented by a dot (data point), color-coded by locality. The size of each dot is based on the population of the tract: tracts with more people have larger dots.

The plot above makes the positive correlation between age and cancer easy to see. Tracts that fall farther to the right on the X-axis (percent of residents 65+ years old) also tend to fall higher on the Y-axis (percent of residents with cancer, except skin cancer). This trend means that as the percent of residents 65+ increases, so does the percent of residents with cancer. Being over 65 doesn’t mean that someone will develop cancer, but it does increase their chances.

More health outcomes

Scatter plots can make correlations easy to see, but you can also calculate a Pearson correlation coefficient to summarize the strength of a relationship. Correlation coefficients fall between -1 and 1, and the closer the number is to 0, the weaker the relationship. Cancer and age from the last example, which have a very strong correlation, have a correlation coefficient of 0.92. Variables can also be negatively correlated: when one tends to be high, the other tends to be low. So a correlation of -0.92 would still be very strong!
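As a rough illustration of what the coefficient measures, here is a minimal sketch that computes Pearson's r by hand. The function name `pearson_r` and the tract values are made up for illustration; they are not the data behind the plot above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical tract-level percentages (illustrative only).
pct_65_plus = [8.0, 12.0, 15.0, 21.0, 30.0]
pct_cancer = [4.1, 5.0, 5.6, 7.2, 9.0]

r = pearson_r(pct_65_plus, pct_cancer)
# Flipping the sign of one variable flips the sign of r but not its strength.
r_flipped = pearson_r(pct_65_plus, [-y for y in pct_cancer])
print(round(r, 2), round(r_flipped, 2))
```

The two printed values have the same magnitude and opposite signs, which is why a strongly negative coefficient is just as "strong" as a strongly positive one.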

We can summarize the correlations between many variables at once in a correlation plot like those shown in the tabs below. Each correlation was calculated across tracts from the tract-level value of each variable, typically the percent of residents in the tract with that characteristic (e.g., “Diabetes” refers to the percent of the population diagnosed with diabetes).

To interpret each correlation coefficient, find the row and column that intersect at its box. For example, the value -0.68 in the violet square where the column for asthma and the row for income intersect means that there is a strong negative correlation between the percent of residents who have asthma and the median household income of each tract in the Charlottesville region. Tracts with higher median household incomes tend to have fewer people with asthma. You can also find our 0.92 correlation coefficient from earlier where 65+ intersects with cancer under the “Age” tab. The variables in each plot are abbreviated, but you can find the full names and more information about them by clicking on the drop-down in each tab.
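To make the row-by-column layout concrete, the sketch below builds a small table with demographic variables as rows and health outcomes as columns. It skips missing values pair by pair, which is one common way to handle gaps. All names and numbers here are hypothetical stand-ins, not the tract data used in the plots.

```python
import math

def pearson_pairwise(xs, ys):
    """Pearson r using only the positions where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical tract-level values; None marks a missing estimate.
demographics = {
    "Income": [32.0, 45.0, 58.0, 71.0, 90.0, 110.0],    # median income, $1000s
    "Insurance": [82.0, 85.0, None, 93.0, 95.0, 97.0],  # % insured
}
outcomes = {
    "Asthma": [11.2, 10.5, 9.8, 9.1, 8.7, 8.0],         # % with asthma
    "Diabetes": [12.0, 11.1, 10.4, None, 8.9, 8.2],     # % with diabetes
}

# Rows are demographic variables, columns are health outcomes.
table = {
    demo: {out: round(pearson_pairwise(dvals, ovals), 2)
           for out, ovals in outcomes.items()}
    for demo, dvals in demographics.items()
}

for demo, row in table.items():
    print(demo, row)
```

Each printed row corresponds to one row of a heatmap: look up the demographic variable first, then the outcome, to read off the coefficient for that pair.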

Many of the relationships shown below cannot be explained as neatly as the relationship between age and cancer. For example, the percent of Black residents is negatively correlated with both household income and the percent of residents with health insurance. In other words, tracts with a large proportion of Black residents also tend to have lower household income and fewer people with health insurance.

econcordat <- cdat %>%
  select(hhincE, hlthinsE, unempE,
         COPD2018, Current_Asthma2018, Diabetes2018, Obesity2018,
         Mental_Health2018, Physical_Health2018, Cancer_except_skin2018) %>%
  rename("Income" = hhincE,
         "Insurance" = hlthinsE,
         "Jobless" = unempE,
         "COPD" = COPD2018,
         "Asthma" = Current_Asthma2018,
         "Diabetes" = Diabetes2018,
         "Obesity" = Obesity2018,
         "Poor MH" = Mental_Health2018,
         "Poor PH" = Physical_Health2018,
         "Cancer" = Cancer_except_skin2018)

racecordat <- cdat %>%
  select(blackE, whiteE, ltnxE, asianE, 
         COPD2018, Current_Asthma2018, Diabetes2018, Obesity2018,
         Mental_Health2018, Physical_Health2018, Cancer_except_skin2018) %>%
  rename("White" = whiteE,
         "Black" = blackE,
         "Hispanic" = ltnxE,
         "Asian" = asianE,
         "COPD" = COPD2018,
         "Asthma" = Current_Asthma2018,
         "Diabetes" = Diabetes2018,
         "Obesity" = Obesity2018,
         "Poor MH" = Mental_Health2018,
         "Poor PH" = Physical_Health2018,
         "Cancer" = Cancer_except_skin2018)

agecordat <- cdat %>%
  select(age17E, age24E, age64E, age65E,
         COPD2018, Current_Asthma2018, Diabetes2018, Obesity2018,
         Mental_Health2018, Physical_Health2018, Cancer_except_skin2018) %>%
  rename("<=17" = age17E,
         "18-24" = age24E,
         "25-64" = age64E,
         "65+" = age65E,
         "COPD" = COPD2018,
         "Asthma" = Current_Asthma2018,
         "Diabetes" = Diabetes2018,
         "Obesity" = Obesity2018,
         "Poor MH" = Mental_Health2018,
         "Poor PH" = Physical_Health2018,
         "Cancer" = Cancer_except_skin2018)

climatedat <- cdat %>%
  select(ba2000E, housing_risk, pm2_5_2016,
         COPD2018, Current_Asthma2018, Diabetes2018, Obesity2018,
         Mental_Health2018, Physical_Health2018, Cancer_except_skin2018) %>%
  rename("New homes" = ba2000E, 
         "Lead risk" = housing_risk,
         "Pollution" = pm2_5_2016,
         "COPD" = COPD2018,
         "Asthma" = Current_Asthma2018,
         "Diabetes" = Diabetes2018,
         "Obesity" = Obesity2018,
         "Poor MH" = Mental_Health2018,
         "Poor PH" = Physical_Health2018,
         "Cancer" = Cancer_except_skin2018)

cormat1 <- as.data.frame(round(cor(econcordat, use = "pairwise.complete.obs"),2))
cormat2 <- as.data.frame(round(cor(racecordat, use = "pairwise.complete.obs"),2))
cormat3 <- as.data.frame(round(cor(agecordat, use = "pairwise.complete.obs"),2))
cormat4 <- as.data.frame(round(cor(climatedat, use = "pairwise.complete.obs"),2))

# Getting rid of extra rows and columns so that the demographic variables 
# are rows and health outcomes are columns 
cormat1 <- cormat1[1:3, 4:10]  # 3 demographic rows; health outcomes (incl. COPD) start at column 4
cormat2 <- cormat2[1:4, 5:11]
cormat3 <- cormat3[1:4, 5:11]
cormat4 <- cormat4[1:3, 4:10]

py$cormat1 = r_to_py(cormat1)
py$cormat2 = r_to_py(cormat2)
py$cormat3 = r_to_py(cormat3)
py$cormat4 = r_to_py(cormat4)

Socio-economic characteristics

import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

sns.heatmap(cormat1, fmt="g", cmap="viridis", annot=True, vmin=-1, vmax=1, center=0,
            linewidths=1, linecolor="white", cbar=True, square=True)
plt.show()

Click here for variable details


Source: CDC Places: Local Data for Better Health

  • COPD: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they had COPD, emphysema, or chronic bronchitis. Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

  • Asthma: Adjusted percent of respondents aged >= 18 years who answer “yes” to both of the following questions: (1) “Have you ever been told by a doctor, nurse, or other health professional that you have asthma?” and (2) “Do you still have asthma?” This indicator requires doctor diagnosis, which may not include all persons with asthma.

  • Diabetes: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they had diabetes (other than diabetes during pregnancy). Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

  • Obesity: Adjusted percent of respondents aged >= 18 years who have a BMI >= 30 kg/m^2 calculated from self-reported weight and height excluding respondents who were <3ft tall or >= 8ft; weighed <50lbs or >= 650 lbs; BMI < 12 or >= 100; pregnant women. Self-reports of height and weight lead to lower BMI estimates compared to height and weight measurements.

  • Poor MH: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their mental health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • Poor PH: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their physical health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • Cancer: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they have any type of cancer except skin cancer. Not specific to cancer type. Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

Source: U.S. Census Bureau, American Community Survey 5-year estimates 2015-2019.

  • Income: The American Community Survey measures income at the household level, capturing income in the last 12 months of all individuals 15 and older in the household. The median household income is the income threshold that divides households into two halves – with half of the households below the value and half of the households above the value.

  • Insurance: The American Community Survey asks if an individual is currently covered by any type of health insurance – provided by an employer or union; purchased directly from an insurance company; Medicare, Medicaid, military-provided, or VA-provided; or any type of health coverage plan. Individuals answering yes to any of these are considered to have health insurance.

  • Jobless: The American Community Survey estimates unemployment only among individuals 16 years and older who are in the labor force. Individuals who have never worked or who are retired are not in the labor force; individuals who are actively working are in the labor force and employed; individuals who are not actively working but who have recently worked and would like to work are in the labor force and unemployed.


Racial demographics

sns.heatmap(cormat2, fmt="g", cmap="viridis", annot=True, vmin=-1, vmax=1, center=0,
            linewidths=1, linecolor="white", cbar=True, square=True)
plt.show()

Click here for variable details


Source: CDC Places: Local Data for Better Health

  • COPD: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they had COPD, emphysema, or chronic bronchitis. Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

  • Asthma: Adjusted percent of respondents aged >= 18 years who answer “yes” to both of the following questions: (1) “Have you ever been told by a doctor, nurse, or other health professional that you have asthma?” and (2) “Do you still have asthma?” This indicator requires doctor diagnosis, which may not include all persons with asthma.

  • Diabetes: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they had diabetes (other than diabetes during pregnancy). Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

  • Obesity: Adjusted percent of respondents aged >= 18 years who have a BMI >= 30 kg/m^2 calculated from self-reported weight and height excluding respondents who were <3ft tall or >= 8ft; weighed <50lbs or >= 650 lbs; BMI < 12 or >= 100; pregnant women. Self-reports of height and weight lead to lower BMI estimates compared to height and weight measurements.

  • Poor MH: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their mental health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • Poor PH: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their physical health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • Cancer: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they have any type of cancer except skin cancer. Not specific to cancer type. Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

Source: U.S. Census Bureau, American Community Survey 5-year estimates 2015-2019.

  • White: The American Community Survey allows individuals to select as many as six race options and Hispanic/Latino is captured separately. The percent white refers to individuals who identified themselves only as white (no other races) and as not Hispanic or Latino.

  • Black: The American Community Survey allows individuals to select as many as six race options and Hispanic/Latino is captured separately. The percent black refers to individuals who identified themselves only as black or African American (no other races) and as not Hispanic or Latino.

  • Hispanic: The American Community Survey captures Hispanic/Latino ethnicity separately from race. The percent Hispanic refers to individuals who identified as Hispanic along with any other race.

  • Asian: The American Community Survey allows individuals to select as many as six race options and Hispanic/Latino is captured separately. The percent Asian refers to individuals who identified themselves as one of Asian Indian, Chinese, Filipino, Japanese, Korean, Vietnamese, or other Asian (no other races) and as not Hispanic or Latino.


Age

sns.heatmap(cormat3, fmt="g", cmap="viridis", annot=True, vmin=-1, vmax=1, center=0,
            linewidths=1, linecolor="white", cbar=True, square=True)
plt.show()

Click here for variable details


Source: CDC Places: Local Data for Better Health

  • COPD: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they had COPD, emphysema, or chronic bronchitis. Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

  • Asthma: Adjusted percent of respondents aged >= 18 years who answer “yes” to both of the following questions: (1) “Have you ever been told by a doctor, nurse, or other health professional that you have asthma?” and (2) “Do you still have asthma?” This indicator requires doctor diagnosis, which may not include all persons with asthma.

  • Diabetes: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they had diabetes (other than diabetes during pregnancy). Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

  • Obesity: Adjusted percent of respondents aged >= 18 years who have a BMI >= 30 kg/m^2 calculated from self-reported weight and height excluding respondents who were <3ft tall or >= 8ft; weighed <50lbs or >= 650 lbs; BMI < 12 or >= 100; pregnant women. Self-reports of height and weight lead to lower BMI estimates compared to height and weight measurements.

  • Poor MH: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their mental health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • Poor PH: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their physical health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • Cancer: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they have any type of cancer except skin cancer. Not specific to cancer type. Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

Source: U.S. Census Bureau, American Community Survey 5-year estimates 2015-2019.

  • <=17: The percent of population 17 or younger estimates the proportion of children in the population, ages 0 to 17.

  • 18-24: The percent of population 18 to 24 estimates the proportion of young adults most likely to be a traditional college age in the population.

  • 25-64: The percent of population 25 to 64 estimates the proportion of adults most likely to be working age in the population.

  • 65+: The percent of population 65 or older estimates the proportion of adults most likely to be retirement age.


Environmental factors

sns.heatmap(cormat4, fmt="g", cmap="viridis", annot=True, vmin=-1, vmax=1, center=0,
            linewidths=1, linecolor="white", cbar=True, square=True)
plt.show()

Click here for variable details


Source: CDC Places: Local Data for Better Health

  • COPD: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they had COPD, emphysema, or chronic bronchitis. Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

  • Asthma: Adjusted percent of respondents aged >= 18 years who answer “yes” to both of the following questions: (1) “Have you ever been told by a doctor, nurse, or other health professional that you have asthma?” and (2) “Do you still have asthma?” This indicator requires doctor diagnosis, which may not include all persons with asthma.

  • Diabetes: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they had diabetes (other than diabetes during pregnancy). Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

  • Obesity: Adjusted percent of respondents aged >= 18 years who have a BMI >= 30 kg/m^2 calculated from self-reported weight and height excluding respondents who were <3ft tall or >= 8ft; weighed <50lbs or >= 650 lbs; BMI < 12 or >= 100; pregnant women. Self-reports of height and weight lead to lower BMI estimates compared to height and weight measurements.

  • Poor MH: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their mental health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • Poor PH: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their physical health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • Cancer: Adjusted percent of respondents aged >= 18 years who report ever having been told by a health professional that they have any type of cancer except skin cancer. Not specific to cancer type. Based on diagnosis and respondent recall of diagnosis, so it might be an underestimate.

Source: U.S. Census Bureau, American Community Survey 5-year estimates 2015-2019.

  • New homes: The percent of housing units in a tract built after 2000. Calculated by adding the number of housing units built from 2000-2009, 2010-2014, and after 2014 and then dividing by the total number of housing units.

  • Lead risk: Estimated percent of houses at risk for lead. Derived from U.S. Census Bureau, American Community Survey 5-year estimates 2015-2019 following methods developed by the Washington State Department of Health and Vox.

  • Pollution: Air pollution, via the concentrations of fine particulate matter that is less than 2.5 micrometers in diameter (PM2.5), at each census tract. PM2.5 concentrations are measured by the number of micrograms per cubic meter. High concentrations of PM2.5 indicate higher levels of air pollution. Here we use the estimated concentration of PM2.5 in 2016. Source: Replication Data for: Disparities in PM2.5 air pollution in the United States


Geographic concentration

The demographic characteristics associated with poorer health outcomes and the health outcomes themselves tend to be concentrated within specific tracts. In other words, in tracts where a demographic variable is particularly high or low, rates of poor health outcomes also tend to be particularly high. For example, tracts that have the lowest median household income also tend to have the highest rates of a variety of poor health outcomes.

In the bar graphs below, the Charlottesville region tracts have been divided into terciles (three roughly equal-sized groups) based on their median household income and ranked 1-3 from low-income to high-income. For each tercile, we then calculated the population-weighted percent of residents with the outcome shown on the Y-axis.
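The grouping and averaging steps can be sketched as follows. This is a simplified illustration rather than the exact procedure used for the graphs: the helper names and tract values are hypothetical, and terciles are assigned by sorted position instead of by quantile breaks.

```python
def tercile_ranks(values):
    """Assign 1 (lowest third) to 3 (highest third) by sorted position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, i in enumerate(order):
        ranks[i] = position * 3 // len(values) + 1
    return ranks

def weighted_percent_by_group(groups, populations, percents):
    """Population-weighted percent per group: 100 * sum(pop * pct / 100) / sum(pop)."""
    affected, totals = {}, {}
    for g, pop, pct in zip(groups, populations, percents):
        # Convert each tract's percent into an estimated head count first.
        affected[g] = affected.get(g, 0.0) + pop * pct / 100
        totals[g] = totals.get(g, 0.0) + pop
    return {g: round(100 * affected[g] / totals[g], 1) for g in sorted(totals)}

# Hypothetical tracts (illustrative only).
income = [28000, 41000, 52000, 64000, 83000, 120000]
population = [3000, 5000, 4000, 6000, 2000, 4000]
pct_poor_mh = [18.0, 15.0, 14.0, 13.0, 11.0, 10.0]

ranks = tercile_ranks(income)
result = weighted_percent_by_group(ranks, population, pct_poor_mh)
print(ranks)
print(result)
```

Weighting by population means a large tract counts for more than a small one, so the result for a tercile can differ from the simple average of its tracts' percentages.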

(Note: Tract 10903 in Albemarle County has been removed from the graphs below because it includes UVa’s campus, where most students live. The median household income for students is so low that it obscures interpretation for tracts that aren’t majority students.)

Physical health, mental health, asthma, and income

dat <- cdat
dat1 <- bi_class(dat, x = hhincE, y = lowwage_p, style = "quantile", dim = 3)
dat1$incrank <- stri_extract(dat1$bi_class, regex = '^\\d{1}(?=-\\d)')

# Drop the UVa tract; a logical filter is safe even if the GEOID is absent
dat1 <- dat1[dat1$GEOID != "51003010903", ]

dat2 <- dat1 %>%  
  dplyr::select(totalpopE, Mental_Health2018, incrank) %>%
  filter(!is.na(Mental_Health2018), !is.na(incrank)) %>% 
  mutate(mh_num = totalpopE*(Mental_Health2018/100)) %>% 
  group_by(incrank) %>% 
  summarize(mh_num = sum(mh_num),
            num = sum(totalpopE),
            mh_per = round((mh_num/num)*100,1))

mentalhealthplot <- dat2 %>% 
  ggplot(aes(incrank, mh_per, fill=incrank)) + 
  geom_bar(position = "dodge", stat = "identity") + 
  scale_fill_manual(labels = c("low-income", "mid-income", "high-income"),
                    values = c('#22A884FF', '#414487FF', '#440154FF'),
                    name = "Household income rank") + 
  labs(x = "Rank", y = "% of population with poor mental health")

dat4 <- dat1 %>%  
  dplyr::select(totalpopE, Current_Asthma2018, incrank) %>%
  filter(!is.na(Current_Asthma2018), !is.na(incrank)) %>% 
  mutate(ca_num = totalpopE*(Current_Asthma2018/100)) %>% 
  group_by(incrank) %>% 
  summarize(ca_num = sum(ca_num),
            num = sum(totalpopE),
            ca_per = round((ca_num/num)*100,1))

asthmaplot <- dat4 %>%  
  ggplot(aes(incrank, ca_per, fill=incrank)) + 
  geom_bar(position = "dodge", stat = "identity") + 
  scale_fill_manual(labels = c("low-income", "mid-income", "high-income"),
                    values = c('#22A884FF', '#414487FF', '#440154FF'),
                    name = "Household income rank") + 
  labs(x = "Rank", y = "% of population with asthma")

dat6 <- dat1 %>%  
  dplyr::select(totalpopE, Physical_Health2018, incrank) %>%
  filter(!is.na(Physical_Health2018), !is.na(incrank)) %>% 
  mutate(ph_num = totalpopE*(Physical_Health2018/100)) %>% 
  group_by(incrank) %>% 
  summarize(ph_num = sum(ph_num),
            num = sum(totalpopE),
            ph_per = round((ph_num/num)*100,1))

physicalhealthplot <- dat6 %>% 
  ggplot(aes(incrank, ph_per, fill=incrank)) + 
  geom_bar(position = "dodge", stat = "identity") + 
  scale_fill_manual(labels = c("low-income", "mid-income", "high-income"),
                    values = c('#22A884FF', '#414487FF', '#440154FF'),
                    name = "Household income rank") + 
  labs(x = "Rank", y = "% of population with poor physical health")

ggarrange(physicalhealthplot, mentalhealthplot, asthmaplot, ncol = 3, common.legend = TRUE)

Click here for variable details


Source: CDC Places: Local Data for Better Health

  • % of population with poor mental health: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their mental health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • % of population with poor physical health: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their physical health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.

  • % of population with asthma: Adjusted percent of respondents aged >= 18 years who answer “yes” to both of the following questions: (1) “Have you ever been told by a doctor, nurse, or other health professional that you have asthma?” and (2) “Do you still have asthma?” This indicator requires doctor diagnosis, which may not include all persons with asthma.

Source: U.S. Census Bureau, American Community Survey 5-year estimates 2015-2019.

  • Median household income: The American Community Survey measures income at the household level, capturing income in the last 12 months of all individuals 15 and older in the household. The median household income is the income threshold that divides households into two halves—with half of the households below the value and half of the households above the value.


Another way to visualize the concentration of some of these outcomes within specific tracts is through a cartogram, where each tract is re-sized based on its score on a variable rather than its actual area. The GIFs below transition between the normal map of the Charlottesville area census tracts and cartograms where tracts are re-sized by the percent of the population with poor mental health and by the percent with poor physical health.
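The re-sizing idea can be illustrated with a small calculation: give each tract a target area proportional to its score, then find how much it must grow or shrink to reach that target. This sketch covers only that target-area step, not the iterative distortion that keeps neighboring tracts connected in a continuous cartogram. The function name and numbers are hypothetical.

```python
import math

def cartogram_scale_factors(areas, weights):
    """Linear scale factor per tract so its area becomes proportional to its weight."""
    total_area = sum(areas)
    total_weight = sum(weights)
    factors = []
    for area, weight in zip(areas, weights):
        # Share of the total map area this tract should occupy.
        target_area = total_area * weight / total_weight
        # Area scales with the square of a linear factor, hence the square root.
        factors.append(math.sqrt(target_area / area))
    return factors

# Hypothetical tracts: a small city tract, a suburban tract, a large rural tract.
areas_km2 = [2.0, 10.0, 40.0]
pct_poor_mh = [18.0, 12.0, 10.0]

factors = cartogram_scale_factors(areas_km2, pct_poor_mh)
new_areas = [a * f ** 2 for a, f in zip(areas_km2, factors)]
print([round(f, 2) for f in factors])
```

A factor above 1 means the tract grows and a factor below 1 means it shrinks. Small tracts with high scores get the largest factors, which is why compact urban tracts can expand so visibly.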

dat2 <- shape %>%
  left_join(cdat, by = "GEOID")
dat2$id <- "Normal Geographic Boundaries"  # scalar is recycled; no hard-coded row count

cart1 <- cartogram_cont(dat2,
                        weight = "Mental_Health2018",
                        itermax = 18,
                        prepare = "remove",
                        threshold = 1)

cart1dat <- as.data.frame(cart1)
cart1dat$id <- "Re-sized by % with Poor Mental Health"  # scalar is recycled; no hard-coded row count

transdat <- dat2 %>%
  bind_rows(cart1dat)

## Basic version 
anim <- ggplot(transdat) + geom_sf(aes(fill = Mental_Health2018)) + 
  scale_fill_viridis() + 
  transition_states(id, transition_length = 3, wrap = T) + 
  theme_void() + 
  labs(title = "{closest_state}", fill = "% with poor mental health")

anim

Click here for variable details


Source: CDC Places: Local Data for Better Health

  • % with poor mental health: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their mental health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.


When re-scaled based on the percentage of residents who report having poor mental health, several tracts within Charlottesville city grow dramatically in size. Overall, rates of poor mental health are much higher within Charlottesville city relative to the surrounding localities.

What’s the deal with the white spaces? 1

cart2 <- cartogram_cont(dat2,
                        weight = "Physical_Health2018",
                        itermax = 4,
                        prepare = "remove",
                        threshold = 0)

cart2dat <- as.data.frame(cart2)
cart2dat$id <- "Re-sized by % with Poor Physical Health"  # scalar is recycled; no hard-coded row count

transdat2 <- dat2 %>%
  bind_rows(cart2dat)

## Basic version 
anim2 <- ggplot(transdat2) + geom_sf(aes(fill = Physical_Health2018)) + 
  scale_fill_viridis() + 
  transition_states(id, transition_length = 3, wrap = T) + 
  theme_void() + 
  labs(title = "{closest_state}", fill = "% with poor physical health")

anim2

Click here for variable details


Source: CDC Places: Local Data for Better Health

  • % with poor physical health: Adjusted percent of respondents aged >= 18 years who report 14 or more days during the past 30 days during which their physical health was not good. Based on self-assessment only and does not have an objective health component, so it’s difficult to assess reliability and validity.


Unlike rates of poor mental health, rates of poor physical health are much higher in the localities surrounding Charlottesville City (Nelson and Greene counties especially), so the distortion of Charlottesville City tracts is not as stark.

Conclusion

The series of visualizations above highlights how one’s access to resources is related to important health outcomes. These data do not establish causation. In other words, we’re unable to say, “Having a lower household income causes someone to have poor mental health.” However, it is easy to imagine how having a lower household income adds stress to one’s life (anxiety about affording housing, childcare, health care, etc.), which may help explain why communities with lower household incomes tend to have higher rates of poor mental health.


  1. The white spaces in the reshaped cartogram are just a result of newly scaled census tracts not fitting together perfectly. They do not represent a specific tract/geographic area.